Task 4 : Location-based Analysis
Objective :
- Perform a geographical analysis of the restaurants in the dataset.
Steps :
- Explore the latitude and longitude coordinates of the restaurants and visualize their distribution on a map.
- Group the restaurants by city or locality and analyze the concentration of restaurants in different areas.
- Calculate statistics such as the average ratings, cuisines, or price ranges by city or locality.
- Identify any interesting insights or patterns related to the locations of the restaurants.
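The grouping and summary steps above can be sketched as follows. This is a minimal illustration on a tiny made-up DataFrame: the rows are hypothetical and not taken from the dataset, and only the column names (`City`, `Aggregate rating`, `Price range`) match it. The output column names (`restaurant_count`, `avg_rating`, `avg_price_range`) are chosen here for illustration.

```python
import pandas as pd

# Hypothetical sample rows; only the column names mirror the real dataset.
sample = pd.DataFrame({
    "City": ["New Delhi", "New Delhi", "Gurgaon", "Gurgaon", "Noida"],
    "Aggregate rating": [3.5, 4.1, 2.9, 3.9, 3.0],
    "Price range": [2, 3, 1, 2, 1],
})

# Per city: number of restaurants, mean rating, and mean price range.
city_summary = (
    sample.groupby("City")
    .agg(
        restaurant_count=("Aggregate rating", "size"),
        avg_rating=("Aggregate rating", "mean"),
        avg_price_range=("Price range", "mean"),
    )
    .sort_values("restaurant_count", ascending=False)
)
print(city_summary)
```

The same pattern should work with `Locality` in place of `City` when a finer-grained view is needed.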
Import necessary Libraries and Data Loading
In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import plotly.express as px
import seaborn as sns
import folium
from folium.plugins import MarkerCluster
from folium.plugins import HeatMap
# Ignore warnings
import warnings
warnings.filterwarnings('ignore')
In [2]:
# Data Loading (CSV file)
dataset = pd.read_csv(r"E:\Cognify\Dataset .csv")
dataset.head(5)
Out[2]:
| | Restaurant ID | Restaurant Name | Country Code | City | Address | Locality | Locality Verbose | Longitude | Latitude | Cuisines | ... | Currency | Has Table booking | Has Online delivery | Is delivering now | Switch to order menu | Price range | Aggregate rating | Rating color | Rating text | Votes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6317637 | Le Petit Souffle | 162 | Makati City | Third Floor, Century City Mall, Kalayaan Avenu... | Century City Mall, Poblacion, Makati City | Century City Mall, Poblacion, Makati City, Mak... | 121.027535 | 14.565443 | French, Japanese, Desserts | ... | Botswana Pula(P) | Yes | No | No | No | 3 | 4.8 | Dark Green | Excellent | 314 |
| 1 | 6304287 | Izakaya Kikufuji | 162 | Makati City | Little Tokyo, 2277 Chino Roces Avenue, Legaspi... | Little Tokyo, Legaspi Village, Makati City | Little Tokyo, Legaspi Village, Makati City, Ma... | 121.014101 | 14.553708 | Japanese | ... | Botswana Pula(P) | Yes | No | No | No | 3 | 4.5 | Dark Green | Excellent | 591 |
| 2 | 6300002 | Heat - Edsa Shangri-La | 162 | Mandaluyong City | Edsa Shangri-La, 1 Garden Way, Ortigas, Mandal... | Edsa Shangri-La, Ortigas, Mandaluyong City | Edsa Shangri-La, Ortigas, Mandaluyong City, Ma... | 121.056831 | 14.581404 | Seafood, Asian, Filipino, Indian | ... | Botswana Pula(P) | Yes | No | No | No | 4 | 4.4 | Green | Very Good | 270 |
| 3 | 6318506 | Ooma | 162 | Mandaluyong City | Third Floor, Mega Fashion Hall, SM Megamall, O... | SM Megamall, Ortigas, Mandaluyong City | SM Megamall, Ortigas, Mandaluyong City, Mandal... | 121.056475 | 14.585318 | Japanese, Sushi | ... | Botswana Pula(P) | No | No | No | No | 4 | 4.9 | Dark Green | Excellent | 365 |
| 4 | 6314302 | Sambo Kojin | 162 | Mandaluyong City | Third Floor, Mega Atrium, SM Megamall, Ortigas... | SM Megamall, Ortigas, Mandaluyong City | SM Megamall, Ortigas, Mandaluyong City, Mandal... | 121.057508 | 14.584450 | Japanese, Korean | ... | Botswana Pula(P) | Yes | No | No | No | 4 | 4.8 | Dark Green | Excellent | 229 |
5 rows × 21 columns
Data Pre-processing
Missing Value Data Analysis
In [3]:
# Check columns of the dataframe
print(list(dataset.columns))
['Restaurant ID', 'Restaurant Name', 'Country Code', 'City', 'Address', 'Locality', 'Locality Verbose', 'Longitude', 'Latitude', 'Cuisines', 'Average Cost for two', 'Currency', 'Has Table booking', 'Has Online delivery', 'Is delivering now', 'Switch to order menu', 'Price range', 'Aggregate rating', 'Rating color', 'Rating text', 'Votes']
In [4]:
# Check null count and data types
dataset.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9551 entries, 0 to 9550
Data columns (total 21 columns):
 #   Column                Non-Null Count  Dtype
---  ------                --------------  -----
 0   Restaurant ID         9551 non-null   int64
 1   Restaurant Name       9551 non-null   object
 2   Country Code          9551 non-null   int64
 3   City                  9551 non-null   object
 4   Address               9551 non-null   object
 5   Locality              9551 non-null   object
 6   Locality Verbose      9551 non-null   object
 7   Longitude             9551 non-null   float64
 8   Latitude              9551 non-null   float64
 9   Cuisines              9542 non-null   object
 10  Average Cost for two  9551 non-null   int64
 11  Currency              9551 non-null   object
 12  Has Table booking     9551 non-null   object
 13  Has Online delivery   9551 non-null   object
 14  Is delivering now     9551 non-null   object
 15  Switch to order menu  9551 non-null   object
 16  Price range           9551 non-null   int64
 17  Aggregate rating      9551 non-null   float64
 18  Rating color          9551 non-null   object
 19  Rating text           9551 non-null   object
 20  Votes                 9551 non-null   int64
dtypes: float64(3), int64(5), object(13)
memory usage: 1.5+ MB
In [5]:
# Handle Missing Values
dataset.isnull().sum()
Out[5]:
Restaurant ID           0
Restaurant Name         0
Country Code            0
City                    0
Address                 0
Locality                0
Locality Verbose        0
Longitude               0
Latitude                0
Cuisines                9
Average Cost for two    0
Currency                0
Has Table booking       0
Has Online delivery     0
Is delivering now       0
Switch to order menu    0
Price range             0
Aggregate rating        0
Rating color            0
Rating text             0
Votes                   0
dtype: int64
- In this dataset, 9 values in the Cuisines column are missing; the rows containing them are dropped below.
In [6]:
# Drop rows where 'Cuisines' is missing and store the result in a new DataFrame
refine_data = dataset.dropna(subset=['Cuisines']).copy()
refine_data.head(5)
Out[6]:
| | Restaurant ID | Restaurant Name | Country Code | City | Address | Locality | Locality Verbose | Longitude | Latitude | Cuisines | ... | Currency | Has Table booking | Has Online delivery | Is delivering now | Switch to order menu | Price range | Aggregate rating | Rating color | Rating text | Votes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6317637 | Le Petit Souffle | 162 | Makati City | Third Floor, Century City Mall, Kalayaan Avenu... | Century City Mall, Poblacion, Makati City | Century City Mall, Poblacion, Makati City, Mak... | 121.027535 | 14.565443 | French, Japanese, Desserts | ... | Botswana Pula(P) | Yes | No | No | No | 3 | 4.8 | Dark Green | Excellent | 314 |
| 1 | 6304287 | Izakaya Kikufuji | 162 | Makati City | Little Tokyo, 2277 Chino Roces Avenue, Legaspi... | Little Tokyo, Legaspi Village, Makati City | Little Tokyo, Legaspi Village, Makati City, Ma... | 121.014101 | 14.553708 | Japanese | ... | Botswana Pula(P) | Yes | No | No | No | 3 | 4.5 | Dark Green | Excellent | 591 |
| 2 | 6300002 | Heat - Edsa Shangri-La | 162 | Mandaluyong City | Edsa Shangri-La, 1 Garden Way, Ortigas, Mandal... | Edsa Shangri-La, Ortigas, Mandaluyong City | Edsa Shangri-La, Ortigas, Mandaluyong City, Ma... | 121.056831 | 14.581404 | Seafood, Asian, Filipino, Indian | ... | Botswana Pula(P) | Yes | No | No | No | 4 | 4.4 | Green | Very Good | 270 |
| 3 | 6318506 | Ooma | 162 | Mandaluyong City | Third Floor, Mega Fashion Hall, SM Megamall, O... | SM Megamall, Ortigas, Mandaluyong City | SM Megamall, Ortigas, Mandaluyong City, Mandal... | 121.056475 | 14.585318 | Japanese, Sushi | ... | Botswana Pula(P) | No | No | No | No | 4 | 4.9 | Dark Green | Excellent | 365 |
| 4 | 6314302 | Sambo Kojin | 162 | Mandaluyong City | Third Floor, Mega Atrium, SM Megamall, Ortigas... | SM Megamall, Ortigas, Mandaluyong City | SM Megamall, Ortigas, Mandaluyong City, Mandal... | 121.057508 | 14.584450 | Japanese, Korean | ... | Botswana Pula(P) | Yes | No | No | No | 4 | 4.8 | Dark Green | Excellent | 229 |
5 rows × 21 columns
In [7]:
# Check for duplicates
refine_data.duplicated().sum()
Out[7]:
0
In [8]:
# Validate coordinates: keep rows with latitude in [-90, 90] and longitude in [-180, 180]
refine_data = refine_data[(refine_data['Latitude'].between(-90, 90)) & (refine_data['Longitude'].between(-180, 180))]
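A range check like the one above does not catch placeholder coordinates such as an exact (0, 0) pair, which passes both bounds but usually means the location is unknown. The sketch below shows one way to flag such rows. It runs on hypothetical rows, and treating (0, 0) as a placeholder is an assumption that is not part of the original notebook; whether the real dataset contains such rows would need to be checked.

```python
import pandas as pd

# Hypothetical rows for illustration; column names mirror the dataset.
coords = pd.DataFrame({
    "Restaurant Name": ["A", "B", "C", "D"],
    "Latitude": [28.57, 0.0, 95.0, 14.56],
    "Longitude": [77.19, 0.0, 77.20, 121.03],
})

# Range check, same logic as the validation cell.
in_range = (
    coords["Latitude"].between(-90, 90)
    & coords["Longitude"].between(-180, 180)
)

# Extra check (assumption): an exact (0, 0) pair marks a missing location.
is_placeholder = (coords["Latitude"] == 0) & (coords["Longitude"] == 0)

# Keep rows that are in range and not placeholders.
valid_coords = coords[in_range & ~is_placeholder].reset_index(drop=True)
print(valid_coords)
```

Row "B" is dropped as a placeholder and row "C" fails the latitude range check.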
In [9]:
# Reset the index after clean-up
refine_data = refine_data.reset_index(drop=True)
In [10]:
# Verify the processed data
refine_data.head(5)
Out[10]:
| | Restaurant ID | Restaurant Name | Country Code | City | Address | Locality | Locality Verbose | Longitude | Latitude | Cuisines | ... | Currency | Has Table booking | Has Online delivery | Is delivering now | Switch to order menu | Price range | Aggregate rating | Rating color | Rating text | Votes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6317637 | Le Petit Souffle | 162 | Makati City | Third Floor, Century City Mall, Kalayaan Avenu... | Century City Mall, Poblacion, Makati City | Century City Mall, Poblacion, Makati City, Mak... | 121.027535 | 14.565443 | French, Japanese, Desserts | ... | Botswana Pula(P) | Yes | No | No | No | 3 | 4.8 | Dark Green | Excellent | 314 |
| 1 | 6304287 | Izakaya Kikufuji | 162 | Makati City | Little Tokyo, 2277 Chino Roces Avenue, Legaspi... | Little Tokyo, Legaspi Village, Makati City | Little Tokyo, Legaspi Village, Makati City, Ma... | 121.014101 | 14.553708 | Japanese | ... | Botswana Pula(P) | Yes | No | No | No | 3 | 4.5 | Dark Green | Excellent | 591 |
| 2 | 6300002 | Heat - Edsa Shangri-La | 162 | Mandaluyong City | Edsa Shangri-La, 1 Garden Way, Ortigas, Mandal... | Edsa Shangri-La, Ortigas, Mandaluyong City | Edsa Shangri-La, Ortigas, Mandaluyong City, Ma... | 121.056831 | 14.581404 | Seafood, Asian, Filipino, Indian | ... | Botswana Pula(P) | Yes | No | No | No | 4 | 4.4 | Green | Very Good | 270 |
| 3 | 6318506 | Ooma | 162 | Mandaluyong City | Third Floor, Mega Fashion Hall, SM Megamall, O... | SM Megamall, Ortigas, Mandaluyong City | SM Megamall, Ortigas, Mandaluyong City, Mandal... | 121.056475 | 14.585318 | Japanese, Sushi | ... | Botswana Pula(P) | No | No | No | No | 4 | 4.9 | Dark Green | Excellent | 365 |
| 4 | 6314302 | Sambo Kojin | 162 | Mandaluyong City | Third Floor, Mega Atrium, SM Megamall, Ortigas... | SM Megamall, Ortigas, Mandaluyong City | SM Megamall, Ortigas, Mandaluyong City, Mandal... | 121.057508 | 14.584450 | Japanese, Korean | ... | Botswana Pula(P) | Yes | No | No | No | 4 | 4.8 | Dark Green | Excellent | 229 |
5 rows × 21 columns
EDA - Exploratory Data Analysis
Statistical Data Analysis and Visual Data Analysis
In [11]:
# Descriptive Statistics
refine_data.describe()
Out[11]:
| | Restaurant ID | Country Code | Longitude | Latitude | Average Cost for two | Price range | Aggregate rating | Votes |
|---|---|---|---|---|---|---|---|---|
| count | 9.542000e+03 | 9542.000000 | 9542.000000 | 9542.000000 | 9542.000000 | 9542.000000 | 9542.000000 | 9542.000000 |
| mean | 9.043301e+06 | 18.179208 | 64.274997 | 25.848532 | 1200.326137 | 1.804968 | 2.665238 | 156.772060 |
| std | 8.791967e+06 | 56.451600 | 41.197602 | 11.010094 | 16128.743876 | 0.905563 | 1.516588 | 430.203324 |
| min | 5.300000e+01 | 1.000000 | -157.948486 | -41.330428 | 0.000000 | 1.000000 | 0.000000 | 0.000000 |
| 25% | 3.019312e+05 | 1.000000 | 77.081565 | 28.478658 | 250.000000 | 1.000000 | 2.500000 | 5.000000 |
| 50% | 6.002726e+06 | 1.000000 | 77.192031 | 28.570444 | 400.000000 | 2.000000 | 3.200000 | 31.000000 |
| 75% | 1.835260e+07 | 1.000000 | 77.282043 | 28.642711 | 700.000000 | 2.000000 | 3.700000 | 130.000000 |
| max | 1.850065e+07 | 216.000000 | 174.832089 | 55.976980 | 800000.000000 | 4.000000 | 4.900000 | 10934.000000 |
Conclusion of Descriptive Statistics
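- Average Cost for Two :
  - The spread is very wide (standard deviation of about 16,129 against a median of 400), reflecting a mix of budget and luxury restaurants.
  - The maximum of 800,000 is far above the 75th percentile (700) and could be an outlier. Costs appear to be recorded in local currency (see the Currency column), so this value is not necessarily in dollars and is not directly comparable across countries.
- Price Range :
  - The scale runs from 1 to 4.
  - The median and the 75th percentile are both 2, so at least three quarters of the restaurants are in range 1 or 2.
  - Most restaurants fall in the low to mid price ranges, with few high-end options.
- Aggregate Rating :
  - The mean rating is 2.67, below 3, while the median is 3.2. The minimum of 0.0 pulls the mean down; a rating of 0 may indicate an unrated restaurant rather than a poor one.
  - Ratings range from 0.0 to 4.9 (standard deviation of about 1.52), indicating mixed customer ratings.
- Votes :
  - Most restaurants have few votes (median 31), but a few are very popular (up to 10,934 votes).
  - The high standard deviation (about 430) shows that some restaurants get far more attention than others.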
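The outlier remark on Average Cost for two can be checked with Tukey's 1.5 × IQR rule. The sketch below plugs in the quartiles and maximum from the `describe()` output above. The choice of Tukey's rule is an assumption made here for illustration and is not the notebook's own method. Because costs may be in different currencies, a per-currency check would be more meaningful than this global one.

```python
# Quartiles and maximum of 'Average Cost for two', copied from describe().
q1 = 250.0
q3 = 700.0
max_cost = 800000.0

# Tukey's rule: flag values more than 1.5 * IQR beyond the quartiles.
iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr
lower_fence = q1 - 1.5 * iqr


def is_outlier(value: float) -> bool:
    """Return True if value lies outside the Tukey fences."""
    return value < lower_fence or value > upper_fence


print(f"IQR = {iqr}, upper fence = {upper_fence}")
print("Maximum flagged as outlier:", is_outlier(max_cost))
```

With these quartiles the upper fence is 1375, so the maximum of 800,000 is flagged, along with any other cost above 1375.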
In [12]:
# Check unique values
unique_dict = dict()
unique_count = dict()
# Loop over the columns and store the unique values and their counts
for col in refine_data.columns:
    unique_dict[col] = refine_data[col].unique()
    unique_count[col] = len(unique_dict[col])
In [13]:
# Unique Value Counts Visual Analysis
plt.figure(figsize=(10,8))
plt.title("Attribute Distribution in Restaurant Data")
# plt.bar returns a BarContainer, which bar_label uses to place the count labels
bars = plt.bar(list(unique_count.keys()), list(unique_count.values()), color='salmon')
plt.bar_label(bars, labels=list(unique_count.values()))
plt.plot(list(unique_count.keys()), list(unique_count.values()), color='slategray', linestyle='dashed', linewidth=2)
plt.xticks(rotation=90)
plt.xlabel("Restaurant Features")
plt.ylabel("No. of unique values")
plt.show()
Graph Analysis Summary
This bar chart visualizes the number of unique values for each feature in the restaurant dataset.
- Restaurants span 15 countries and many cities, with a wide variety of cuisines.
- Each restaurant has a unique Restaurant ID, but restaurant names repeat, which suggests chains or restaurants that share a name.
- Has Table booking and Has Online delivery take only a small number of distinct values.
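The point about unique identifiers and repeated names can be verified directly. The sketch below uses hypothetical rows, with only the column names taken from the dataset, to confirm that IDs are unique and to list the names that occur more than once.

```python
import pandas as pd

# Hypothetical rows for illustration; column names mirror the dataset.
names = pd.DataFrame({
    "Restaurant ID": [1, 2, 3, 4, 5],
    "Restaurant Name": ["Cafe X", "Cafe X", "Diner Y", "Cafe X", "Bistro Z"],
})

# Every restaurant should have its own ID.
ids_unique = names["Restaurant ID"].is_unique

# Names that appear more than once are candidate chains.
name_counts = names["Restaurant Name"].value_counts()
repeated_names = name_counts[name_counts > 1]

print("IDs unique:", ids_unique)
print(repeated_names)
```

A repeated name is only a hint of a chain; unrelated restaurants in different cities can share a name.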
Graphical analysis
Geographical Map visualization
In [14]:
# Create map centered on median coordinates
center_lat = refine_data['Latitude'].median()
center_long = refine_data['Longitude'].median()
restaurant_map = folium.Map(location=[center_lat, center_long], zoom_start=11)
# Add clustered markers
marker_cluster = MarkerCluster().add_to(restaurant_map)
for _, row in refine_data.iterrows():
    folium.Marker(
        location=[row['Latitude'], row['Longitude']],
        popup=f"{row['Restaurant Name']} ({row['City']})",
        tooltip=row['Cuisines']
    ).add_to(marker_cluster)
print("\nNumber of restaurants at each location\n\n")
restaurant_map
Number of restaurants at each location
Out[14]:
[Interactive folium map with clustered restaurant markers; not rendered in this export]
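Because the map is centered on the median coordinates with `zoom_start=11`, it opens on the densest area, and restaurants elsewhere are off-screen until the reader zooms out. One alternative, sketched below on hypothetical points, is to compute the bounding box of all coordinates and pass it to folium's `fit_bounds`. The helper `map_bounds` and the variable `world_map` are names introduced here for illustration. The folium part is wrapped in a `try` block so the bounds calculation still runs where folium is not installed.

```python
import pandas as pd

# Hypothetical points for illustration; column names mirror the dataset.
pts = pd.DataFrame({
    "Latitude": [28.57, 14.56, -41.33],
    "Longitude": [77.19, 121.03, 174.83],
})


def map_bounds(df):
    """Return [[south, west], [north, east]] covering all points in df."""
    south_west = [float(df["Latitude"].min()), float(df["Longitude"].min())]
    north_east = [float(df["Latitude"].max()), float(df["Longitude"].max())]
    return [south_west, north_east]


bounds = map_bounds(pts)
print(bounds)

try:
    import folium

    # Start from a default map and zoom it to cover every point.
    world_map = folium.Map()
    world_map.fit_bounds(bounds)
except ImportError:
    world_map = None  # folium is unavailable; only the bounds are computed
```

A simple min/max box does not handle point sets that wrap around the 180° meridian, so it is best treated as a starting point.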